This analysis is a first pass of the data from the early 2024 DIVA Reproducibility Study. The data were wrangled in the script labelled [100_data_wrangling.html](./100_data_wrangling.html).
Analysis
Setup
Libraries
Getting R libraries.
Code
# R code
# Getting essential libraries
library("tidyverse")
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
library("janitor")
Attaching package: 'janitor'
The following objects are masked from 'package:stats':
chisq.test, fisher.test
Code
library("here")
here() starts at /home/jovyan/reproducibility_3site
Code
library("ggokabeito")
library("zoo")
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Code
library("nlme")
Attaching package: 'nlme'
The following object is masked from 'package:dplyr':
collapse
Variables
Code
# R code
# Setting seed and selecting Okabe-Ito color order
set.seed(20240510)
okabe_order = c(5, 6, 1, 2, 8, 7, 4, 3, 9)
Functions
Getting a function to find UTC offset.
Code
# R code
get_utc_offset <- function(ts, ts_utc = NULL, as_numeric = FALSE) {
  # Throwing an error if timestamp is not a POSIXct
  stopifnot(inherits(ts, "POSIXct"))
  if (!is.null(ts_utc)) {
    stopifnot(inherits(ts_utc, "POSIXct"))
  } else {
    ts_utc <- ts
    attributes(ts_utc)$tzone <- "UTC"
  }
  # Forcing the initial time zone to a new time zone
  char_ts <- format(ts, "%Y-%m-%d %H:%M:%S")
  force_utc_ts <- as.POSIXct(char_ts, tz = "UTC")
  diff_secs <- as.double(force_utc_ts) - as.double(ts_utc)
  # Finding UTC time and using it to produce offset
  offset_hr <- abs(diff_secs) %/% 3600 * sign(diff_secs)
  offset_mn <- abs(diff_secs) %% 3600 %/% 60 * sign(diff_secs)
  # Getting character or numeric output
  if (as_numeric) {
    offset <- offset_hr + offset_mn / 60
  } else {
    offset <- paste0(
      formatC(offset_hr, width = 3, flag = "0+"), ":",
      formatC(abs(offset_mn), width = 2, flag = "0")
    )
  }
  # Returning a properly formatted UTC offset as a character vector
  return(offset)
}
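The core step in `get_utc_offset()` is to re-read the local wall-clock time as if it were UTC and measure the gap. The sketch below reproduces that step in base R for one illustrative timestamp. The variable names are hypothetical, and the time zone `America/Los_Angeles` is borrowed from the filtering code used elsewhere in this report.

```r
# Illustrative timestamp: 2024-03-05 15:24:48 local time in Los Angeles (PST)
ts <- as.POSIXct("2024-03-05 15:24:48", tz = "America/Los_Angeles")

# Re-read the same wall-clock string as if it had been recorded in UTC
wall_as_utc <- as.POSIXct(format(ts, "%Y-%m-%d %H:%M:%S"), tz = "UTC")

# The gap between the two instants is the UTC offset in hours
offset_hours <- as.numeric(difftime(wall_as_utc, ts, units = "hours"))
offset_hours

# One week later, after the switch to daylight time, the offset changes
ts_dst <- as.POSIXct("2024-03-12 15:24:48", tz = "America/Los_Angeles")
wall_dst_utc <- as.POSIXct(format(ts_dst, "%Y-%m-%d %H:%M:%S"), tz = "UTC")
offset_hours_dst <- as.numeric(difftime(wall_dst_utc, ts_dst, units = "hours"))
offset_hours_dst
```

With a standard time zone database installed, the first offset should come out as -8 hours (PST) and the second as -7 hours (PDT).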
Loading Data
Loading .RDS Files
Loading the .RDS files created in the data wrangling script.
There’s a set of 3 cages from BioMarin that weren’t placed into a recording DAX2 slot until about halfway through the third replicate of the study. These cages are:
R3 cage C5: male J:ARC mouse placed in cage at 2024-03-05 15:24:48 PST
R3 cage B5: male C57BL/6J mouse placed in cage at 2024-03-05 15:24:46 PST
R3 cage A5: female A/J mouse placed in cage at 2024-03-05 15:24:58 PST
Filtering out the hours recorded before these cages were placed, then computing rolling means using TSD on occupancy-normalized activity.
# R code
hours_df = repro_activity_1hr_df |>
  ungroup() |>
  filter(!(cage_name %in% c("A5", "B5", "C5") &
    site == "BioMarin" &
    replicate == "Replicate 3" &
    time < ymd_hms("2024-03-05 16:00:00", tz = "America/Los_Angeles")))
video_hours = nrow(hours_df)
animal_hours = sum(pull(hours_df, animals_cage_quantity))
cat("We collected a total of ",
    prettyNum(video_hours, big.mark = ",", scientific = FALSE),
    " hours (", round(video_hours / 24 / 365.2425, 2),
    " years) of video documenting ",
    prettyNum(animal_hours, big.mark = ",", scientific = FALSE),
    " hours (", round(animal_hours / 24 / 365.2425, 2),
    " years) of mouse home cage behavior.", sep = "")
We collected a total of 25,755 hours (2.94 years) of video documenting 76,495 hours (8.73 years) of mouse home cage behavior.
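The hours-to-years conversion can be checked by hand from the two totals printed above. This is only a worked arithmetic check: the variable names are illustrative, and 365.2425 is the year length in days used in the code above.

```r
# Totals reported in the output above
video_hours <- 25755
animal_hours <- 76495

# Year length in days, as used in the report's code
days_per_year <- 365.2425

# Hours -> days -> years, rounded to two decimals as in the report
video_years <- round(video_hours / 24 / days_per_year, 2)
animal_years <- round(animal_hours / 24 / days_per_year, 2)

# Ratio of animal-hours to video-hours (roughly the mean cage occupancy)
animals_per_cage_hour <- animal_hours / video_hours

c(video_years = video_years, animal_years = animal_years)
```

The two rounded values match the 2.94 and 8.73 years reported above, and the ratio of the two totals is a little under 3 animals per recorded cage-hour.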
Making spaghetti plot
Plotting aligned data. Adding a vertical dashed line at 2024-02-27, which was when a major model update that included DAX2 training data went live. Starting with the rolling mean plot.
`summarise()` has grouped output by 'start_date_local', 'cage_id', 'cage_name',
'site', 'replicate', 'strain'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'cage_id', 'cage_name', 'site',
'replicate', 'strain'. You can override using the `.groups` argument.
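The rolling-mean code itself does not appear in this rendering. As a rough illustration of the idea only, here is a base R sketch of a trailing rolling mean on made-up hourly values. The 3-hour window and all names are assumptions, not the settings used in this analysis. With the zoo package loaded above, `zoo::rollmean()` with `align = "right"` and `fill = NA` would be the usual equivalent.

```r
# Made-up hourly activity values for one cage (not study data)
activity <- c(2, 4, 6, 8, 10, 12)
window <- 3  # hypothetical window width in hours

# Trailing moving average: each value averages the current and previous hours.
# The first (window - 1) positions have no full window and come back as NA.
rolling_mean <- as.numeric(
  stats::filter(activity, rep(1 / window, window), sides = 1)
)
rolling_mean
```

Calling `stats::filter()` with its namespace avoids the `dplyr::filter()` masking reported when the tidyverse was attached.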
# R code
genetics = meters_per_day_aov["strain", "pve"]
sex = meters_per_day_aov["sex", "pve"]
technical_factors = meters_per_day_aov["site", "pve"] +
  meters_per_day_aov["replicate", "pve"] +
  meters_per_day_aov["strain:site", "pve"] +
  meters_per_day_aov["strain:replicate", "pve"]
residuals = meters_per_day_aov["Residuals", "pve"]
cat("Genetics accounts for ", round(genetics, 1), "% of the variance in 24-hour activity.\n",
    "Sex accounts for ", round(sex, 1), "% of the variance in 24-hour activity.\n",
    "Technical factors (site, replicate, and their interaction with genetics) accounts for ",
    round(technical_factors, 1), "% of the variance in 24-hour activity.\n",
    "Residuals (error) accounts for ", round(residuals, 1), "% of the variance in 24-hour activity.",
    sep = "")
Genetics accounts for 81.3% of the variance in 24-hour activity.
Sex accounts for 0.1% of the variance in 24-hour activity.
Technical factors (site, replicate, and their interaction with genetics) accounts for 3.9% of the variance in 24-hour activity.
Residuals (error) accounts for 13.1% of the variance in 24-hour activity.
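The code that built `meters_per_day_aov` does not appear in this rendering. One common way to get a percent-variance-explained column is to divide each term's sum of squares by the total sum of squares. The sketch below does this on R's built-in `ToothGrowth` data, which only stands in for the study data. Whether the `pve` column in this report was computed this way is an assumption.

```r
# Stand-in data: ToothGrowth ships with base R (not study data)
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Two factors plus their interaction, loosely mirroring a strain-by-site design
fit <- lm(len ~ supp + dose + supp:dose, data = tg)
aov_tab <- as.data.frame(anova(fit))

# Percent variance explained = term sum of squares / total sum of squares
aov_tab$pve <- 100 * aov_tab[["Sum Sq"]] / sum(aov_tab[["Sum Sq"]])
round(aov_tab["pve"], 1)
```

Computed this way, the `pve` values across all rows of the table, including `Residuals`, sum to 100.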
Now plotting these data.
Code
# R code
meters_day_plot = ggplot(data = activity_24h_mean,
                         aes(x = strain, y = meters_day, color = strain)) +
  geom_boxplot(na.rm = TRUE) +
  ggbeeswarm::geom_beeswarm() +
  theme_bw() +
  theme(legend.position = "none",
        panel.grid = element_line(color = "#FFFFFF")) +
  scale_color_okabe_ito(name = "Genetics", order = okabe_order[c(4:9, 1:3)]) +
  xlab("Genetics") +
  ylab("Average Activity (meters/day)")
okabe_order_factors = c("#56B4E9", "#D55E00", "#009E73", "#999999",
                        "#0072B2", "#CC79A7", "#F0E442", "#E69F00")
meters_day_pve_plot = ggplot(data = meters_per_day_aov,
                             aes(x = pve, y = study, fill = factor)) +
  geom_bar(stat = "identity", color = "#000000") +
  scale_fill_manual(name = NULL, values = okabe_order_factors) +
  theme_bw() +
  xlab("Percent Variance Explained") +
  ylab(NULL) +
  theme(legend.position = "bottom",
        panel.grid = element_line(color = "#FFFFFF"))
meters_day_plot
Doing site-specific ANOVAs. Starting with AbbVie.
Code
# R code
abbvie_meters_per_day_aov = as.data.frame(anova(lm(
  meters_day ~ strain + sex + replicate + strain:sex + strain:replicate,
  data = activity_24h_mean |>
    dplyr::filter(site == "AbbVie")
))) |>
  janitor::clean_names()
abbvie_meters_per_day_aov
Next, doing BioMarin.
Code
# R code
biomarin_meters_per_day_aov = as.data.frame(anova(lm(
  meters_day ~ strain + sex + replicate + strain:sex + strain:replicate,
  data = activity_24h_mean |>
    dplyr::filter(site == "BioMarin")
))) |>
  janitor::clean_names()
biomarin_meters_per_day_aov
Finally, doing Novartis.
Code
# R code
novartis_meters_per_day_aov = as.data.frame(anova(lm(
  meters_day ~ strain + sex + replicate + strain:sex + strain:replicate,
  data = activity_24h_mean |>
    dplyr::filter(site == "Novartis")
))) |>
  janitor::clean_names()
novartis_meters_per_day_aov
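The three site-specific fits above share one model formula, so they could also be run in a single pass over the sites. The sketch below shows one way to do that in base R on simulated data. The `toy` data frame, its values, and the choice of three strains are stand-ins for `activity_24h_mean`, not study data.

```r
set.seed(1)  # arbitrary seed for the simulated data

# Simulated stand-in for activity_24h_mean (not study data)
toy <- expand.grid(
  site = c("AbbVie", "BioMarin", "Novartis"),
  strain = c("A/J", "C57BL/6J", "J:ARC"),
  sex = c("female", "male"),
  replicate = c("Replicate 1", "Replicate 2", "Replicate 3"),
  mouse = 1:2,
  stringsAsFactors = TRUE
)
toy$meters_day <- rnorm(nrow(toy), mean = 300, sd = 50)

# Fit the same model formula once per site and keep each ANOVA table
site_aovs <- lapply(split(toy, toy$site), function(site_df) {
  fit <- lm(
    meters_day ~ strain + sex + replicate + strain:sex + strain:replicate,
    data = site_df
  )
  as.data.frame(anova(fit))
})

names(site_aovs)
```

Each element of `site_aovs` is an ANOVA table for one site, with one row per model term plus a `Residuals` row. Piping each table through `janitor::clean_names()` would give the same column naming as the tables above.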
Now working with light/dark data.
Computing dark time and light time data frames for these datasets.
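The light/dark code does not appear in this rendering, and the light schedule is not stated here. As a rough sketch of the general approach, the example below labels made-up hourly records as light or dark and totals activity per phase. The lights-on window of 06:00 to 18:00 local time and all names are assumptions.

```r
# Made-up hourly records for one cage over one day (not study data)
hours <- data.frame(
  time = as.POSIXct("2024-03-05 00:00:00", tz = "America/Los_Angeles") + 3600 * 0:23,
  meters = rep(c(10, 2, 10), times = c(6, 12, 6))
)

# ASSUMPTION: lights on from 06:00 up to (not including) 18:00 local time;
# the real schedule for the study is not given in this report
hour_of_day <- as.integer(format(hours$time, "%H"))
hours$phase <- ifelse(hour_of_day >= 6 & hour_of_day < 18, "light", "dark")

# Total activity per phase
phase_totals <- tapply(hours$meters, hours$phase, sum)
phase_totals
```

Deriving the hour from the local-time `POSIXct` value, rather than from UTC, keeps the phase labels aligned with each site's own clock.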
`summarise()` has grouped output by 'study_name', 'cage_name', 'replicate',
'site', 'strain', 'sex', 'start_date_local', 'experiment_date'. You can
override using the `.groups` argument.
`summarise()` has grouped output by 'study_id', 'study_name', 'site',
'replicate', 'cage_id', 'cage_name', 'strain', 'sex'. You can override using
the `.groups` argument.
`summarise()` has grouped output by 'study_id', 'study_name', 'site',
'replicate', 'cage_id', 'cage_name', 'strain', 'sex'. You can override using
the `.groups` argument.
`summarise()` has grouped output by 'study_id', 'study_name', 'site',
'replicate', 'cage_id', 'cage_name', 'strain'. You can override using the
`.groups` argument.